Privacy Preserving Clustering on Distorted data

نویسنده

Thanveer Jahan

چکیده

In designing various security and privacy related data mining applications, privacy preserving has become a major concern. Protecting sensitive or confidential information in data mining is an important long term goal. An increased data disclosure risks may encounter when it is released. Various data distortion techniques are widely used to protect sensitive data; these approaches protect data by adding noise or by different matrix decomposition methods. In this paper we primarily focus, data distortion methods such as singular value decomposition (SVD) and sparsified singular value decomposition (SSVD). Various privacy metrics have been proved to measure the difference between original dataset and distorted dataset and degree of privacy protection. The data mining utility k-means clustering is used on these distorted datasets. Our experimental results use a real world dataset. An efficient solution is achieved using sparsified singular value decomposition and singular value decomposition, meeting privacy requirements. The accuracy while using the distorted data is almost equal to that of the original dataset. KeywordsPrivacy Preserving, Data Distortion, Singular Value Decomposition (SVD), Sparsified Singular Value Decomposition (SSVD), k--means clustering.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

SVD based Data Transformation Methods for Privacy Preserving Clustering

Nowadays privacy issues are major concern for many government and other private organizations to delve important information from large repositories of data. Privacy preserving clustering which is one of the techniques emerged to addresses the problem of extracting useful clustering patterns from distorted data without accessing the original data directly. In this paper two hybrid data transfor...

متن کامل

A Fuzzy Based Approach for Privacy Preserving Clustering

Extracting previously unknown patterns from huge volume of data is the primary objective of any data mining algorithm. In recent days there is a tremendous growth in data collection due to the advancement in the field of information technology. The patterns revealed by data mining algorithm can be used in various domains like Image Analysis, Marketing and weather forecasting. As a side effect o...

متن کامل

Repeated Record Ordering for Constrained Size Clustering

One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into current clustering algorithms. One of the well researched constrained clustering algorithms is called microaggregation. In a microaggreg...

متن کامل

Data Security Using Decomposition

Protection of privacy from unauthorized access is one of the primary concerns in data use, from national security to business transactions. It brings out a new branch of data mining, known as Privacy Preserving Data Mining (PPDM). Privacy-Preserving is a major concern in the application of data mining techniques to datasets containing personal, sensitive, or confidential information. Data disto...

متن کامل